by Robert A. Stevens, February 24, 2014
Adpated from “Modeling and Analysis of Row Crop Harvesting Patterns by Combines” by A.C. Hansen, Q. Zhang, and T.A. Wilcox, Transactions of the ASABE, Vol. 50(1):5-12 (2007)
Table 1. Summary of equipment, field details, and electronic flags.
Figure 1. Harvesting patterns for single combine in a row crop.
Corn-harvesting.xls (2 tabs)
Corn-harvesting-CAN.csv (CAN tab)
Combine1.pptx
Corn-Harvest-2.xls
Corn-Harvest-2-CAN.csv
Combine2a.pptx (as delivered - records out of time order)
Combine2b.pptx (fixed - records in time order)
LONGITUDE
LATITUDE
TIME
STATUS (head position: 1 = down, 0 = up)
DRY_YIELD (bushels/acre)
DAYOFMONTH
HOUR
MINUTE
SECOND
SPEED (ground speed in mph)
TORQUE (% of torque at rated)
ENG_SPD (rpm)
CURVE (unloading auger: 1 = off, 2 = engaged)
FUEL (% of maximum?)
FUEL_KG.HR (fuel rate in kg/hour)
GRD_SPD (ground speed in km/hour)
MPH (ground speed in mph)
Load the necessary R packages.
library(ggplot2) # Used for several of the plots
library(maps) # Required for correct Lon/Lat projections for ggplot2 coord_map()
library(plyr) # Used for sorting dataframe to make sure in the correct order
library(ggmap) # Used for Google Earth style plots
library(rpart) # Recursive Partitioning data mining routine
library(rpart.plot) # Improves the rpart plots
Code to add footnotes to all of the plots from “R: Good practice - adding footnotes to graphics”
http://ryouready.wordpress.com/2009/02/17/r-good-practice-adding-footnotes-to-graphics/
# basic information at the beginning of each script
sourceText <- "Source: 'Modeling and Analysis of Row Crop Harvesting Patterns by Combines' by Hansen et al."
scriptName <- "Script: CombineV2.Rmd"
author <- "RAStevens"
footnote <- paste(sourceText, scriptName, format(Sys.time(), "%d %b %Y"), author,
sep = " / ")
# default footnote is today's date, cex = 0.7 (size) and color is a kind of
# grey
makeFootnote <- function(footnoteText = format(Sys.time(), "%d %b %Y"), size = 0.7,
color = grey(0.5)) {
require(grid)
pushViewport(viewport())
grid.text(label = footnoteText, x = unit(1, "npc") - unit(2, "mm"), y = unit(2,
"mm"), just = c("right", "bottom"), gp = gpar(cex = size, col = color))
popViewport()
}
Load, prepare, and merge the two data sets.
setwd("C:\\Users\\rs62172\\Documents\\Corn\\") # user specific - need to edit to run
filename <- "Corn-harvesting-CAN.csv"
Combine1 <- read.csv(filename, header = TRUE, as.is = TRUE)
filename <- "Corn-Harvest-2-CAN.csv"
Combine2 <- read.csv(filename, header = TRUE, as.is = TRUE)
Combine2 <- arrange(Combine2, DAYOFMONTH, HOUR, MINUTE, SECOND) # put in the right time order
Combine1 <- Combine1[, -c(3:4)] # remove Easting and Northing
dim(Combine1)
## [1] 12013 17
Combine2 <- Combine2[, -c(11)] # remove TIMELAPSE
dim(Combine2)
## [1] 11519 17
Combine1$Combine <- 1
Combine2$Combine <- 2
Combine12 <- rbind(Combine1, Combine2)
Combine12$Combine <- as.factor(Combine12$Combine)
dim(Combine12)
## [1] 23532 18
mapLon <- median(Combine12$LONGITUDE)
mapLat <- median(Combine12$LATITUDE)
The location is:
revgeocode(c(mapLon, mapLat))
## Information from URL : http://maps.googleapis.com/maps/api/geocode/json?latlng=40.28414,-89.15606&sensor=false
## Google Maps API Terms of Service : http://developers.google.com/maps/terms
## [1] "2313-2399 Old Principal Road, McLean, IL 61754, USA"
Use ggmap with zoom values 10 to 17 to get an idea of where this is:
plotTitle <- "Combine 1 and Combine 2 Paths using ggmap with zoom = 10"
map <- get_googlemap(center = c(lon = mapLon, lat = mapLat), zoom = 10, maptype = "hybrid")
## Map from URL : http://maps.googleapis.com/maps/api/staticmap?center=40.28414,-89.15606&zoom=10&size=%20640x640&maptype=hybrid&sensor=false
## Google Maps API Terms of Service : http://developers.google.com/maps/terms
ggmap(map) + geom_path(data = Combine12, aes(x = LONGITUDE, y = LATITUDE, color = Combine),
size = I(1)) + labs(title = paste(plotTitle), x = "Longitude", y = "Latitude")
makeFootnote(footnote)
## Loading required package: grid
plotTitle <- "Combine 1 and Combine 2 Paths using ggmap with zoom = 11"
map <- get_googlemap(center = c(lon = mapLon, lat = mapLat), zoom = 11, maptype = "hybrid")
## Map from URL : http://maps.googleapis.com/maps/api/staticmap?center=40.28414,-89.15606&zoom=11&size=%20640x640&maptype=hybrid&sensor=false
## Google Maps API Terms of Service : http://developers.google.com/maps/terms
ggmap(map) + geom_path(data = Combine12, aes(x = LONGITUDE, y = LATITUDE, color = Combine),
size = I(1)) + labs(title = paste(plotTitle), x = "Longitude", y = "Latitude")
makeFootnote(footnote)
plotTitle <- "Combine 1 and Combine 2 Paths using ggmap with zoom = 12"
map <- get_googlemap(center = c(lon = mapLon, lat = mapLat), zoom = 12, maptype = "hybrid")
## Map from URL : http://maps.googleapis.com/maps/api/staticmap?center=40.28414,-89.15606&zoom=12&size=%20640x640&maptype=hybrid&sensor=false
## Google Maps API Terms of Service : http://developers.google.com/maps/terms
ggmap(map) + geom_path(data = Combine12, aes(x = LONGITUDE, y = LATITUDE, color = Combine),
size = I(1)) + labs(title = paste(plotTitle), x = "Longitude", y = "Latitude")
makeFootnote(footnote)
plotTitle <- "Combine 1 and Combine 2 Paths using ggmap with zoom = 13"
map <- get_googlemap(center = c(lon = mapLon, lat = mapLat), zoom = 13, maptype = "hybrid")
## Map from URL : http://maps.googleapis.com/maps/api/staticmap?center=40.28414,-89.15606&zoom=13&size=%20640x640&maptype=hybrid&sensor=false
## Google Maps API Terms of Service : http://developers.google.com/maps/terms
ggmap(map) + geom_path(data = Combine12, aes(x = LONGITUDE, y = LATITUDE, color = Combine),
size = I(1)) + labs(title = paste(plotTitle), x = "Longitude", y = "Latitude")
makeFootnote(footnote)
plotTitle <- "Combine 1 and Combine 2 Paths using ggmap with zoom = 14"
map <- get_googlemap(center = c(lon = mapLon, lat = mapLat), zoom = 14, maptype = "hybrid")
## Map from URL : http://maps.googleapis.com/maps/api/staticmap?center=40.28414,-89.15606&zoom=14&size=%20640x640&maptype=hybrid&sensor=false
## Google Maps API Terms of Service : http://developers.google.com/maps/terms
ggmap(map) + geom_path(data = Combine12, aes(x = LONGITUDE, y = LATITUDE, color = Combine),
size = I(1)) + labs(title = paste(plotTitle), x = "Longitude", y = "Latitude")
makeFootnote(footnote)
plotTitle <- "Combine 1 and Combine 2 Paths using ggmap with zoom = 15"
map <- get_googlemap(center = c(lon = mapLon, lat = mapLat), zoom = 15, maptype = "hybrid")
## Map from URL : http://maps.googleapis.com/maps/api/staticmap?center=40.28414,-89.15606&zoom=15&size=%20640x640&maptype=hybrid&sensor=false
## Google Maps API Terms of Service : http://developers.google.com/maps/terms
ggmap(map) + geom_path(data = Combine12, aes(x = LONGITUDE, y = LATITUDE, color = Combine),
size = I(1)) + labs(title = paste(plotTitle), x = "Longitude", y = "Latitude")
makeFootnote(footnote)
plotTitle <- "Combine 1 and Combine 2 Paths using ggmap with zoom = 16"
map <- get_googlemap(center = c(lon = mapLon, lat = mapLat), zoom = 16, maptype = "hybrid")
## Map from URL : http://maps.googleapis.com/maps/api/staticmap?center=40.28414,-89.15606&zoom=16&size=%20640x640&maptype=hybrid&sensor=false
## Google Maps API Terms of Service : http://developers.google.com/maps/terms
ggmap(map) + geom_path(data = Combine12, aes(x = LONGITUDE, y = LATITUDE, color = Combine),
size = I(1)) + labs(title = paste(plotTitle), x = "Longitude", y = "Latitude")
makeFootnote(footnote)
plotTitle <- "Combine 1 and Combine 2 Paths using ggmap with zoom = 17"
map <- get_googlemap(center = c(lon = mapLon, lat = mapLat), zoom = 17, maptype = "hybrid")
## Map from URL : http://maps.googleapis.com/maps/api/staticmap?center=40.28414,-89.15606&zoom=17&size=%20640x640&maptype=hybrid&sensor=false
## Google Maps API Terms of Service : http://developers.google.com/maps/terms
ggmap(map) + geom_path(data = Combine12, aes(x = LONGITUDE, y = LATITUDE, color = Combine),
size = I(1)) + labs(title = paste(plotTitle), x = "Longitude", y = "Latitude")
makeFootnote(footnote)
Explore the data using combine paths and color coding for other variables:
Combine12$Combine <- as.factor(Combine12$Combine)
plotTitle <- "Combine 1 and Combine 2 Paths"
qplot(LONGITUDE, LATITUDE, data = Combine12, geom = "point", color = Combine,
size = I(2)) + scale_x_continuous("Longitude") + scale_y_continuous("Latitude") +
coord_map() + labs(title = paste(plotTitle), x = "Longitude", y = "Latitude")
makeFootnote(footnote)
plotTitle <- "Harvest Pattern for Combines 1 and 2 with CURVE (unloading auger: 1 = off, 2 = engaged)"
Combine12$CURVE <- as.factor(Combine12$CURVE)
qplot(LONGITUDE, LATITUDE, data = Combine12, geom = "point", color = CURVE,
size = I(2)) + scale_x_continuous("Longitude") + scale_y_continuous("Latitude") +
coord_map() + labs(title = paste(plotTitle)) + facet_grid(. ~ Combine)
makeFootnote(footnote)
plotTitle <- "Harvest Pattern for Combines 1 and 2 with STATUS (head position: 1 = down, 0 = up)"
Combine12$STATUS <- as.factor(Combine12$STATUS)
qplot(LONGITUDE, LATITUDE, data = Combine12, geom = "point", color = STATUS,
size = I(2)) + scale_x_continuous("Longitude") + scale_y_continuous("Latitude") +
coord_map() + labs(title = paste(plotTitle)) + facet_grid(. ~ Combine)
makeFootnote(footnote)
plotTitle <- "Harvest Pattern for Combines 1 and 2 with Dry Yield (bushels/acre)"
qplot(LONGITUDE, LATITUDE, data = Combine12, geom = "point", color = DRY_YIELD,
size = I(2)) + scale_x_continuous("Longitude") + scale_y_continuous("Latitude") +
coord_map() + labs(title = paste(plotTitle)) + facet_grid(. ~ Combine) +
scale_colour_gradientn(colours = rainbow(7))
makeFootnote(footnote)
plotTitle <- "Dry Yield Histograms (Combine 1 = Red, Combine 2 = Blue)"
ggplot(Combine12, aes(x = DRY_YIELD)) + geom_histogram(data = subset(Combine12,
Combine == 1), fill = "red", alpha = 0.2) + geom_histogram(data = subset(Combine12,
Combine == 2), fill = "blue", alpha = 0.2) + labs(title = paste(plotTitle),
x = "Dry Yield (bushels/acre)") + theme_bw()
## stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to adjust this.
## stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to adjust this.
makeFootnote(footnote)
plotTitle <- "Harvest Pattern for Combines 1 and 2 with Engine Speed (rpm)"
qplot(LONGITUDE, LATITUDE, data = Combine12, geom = "point", color = ENG_SPD,
size = I(2)) + scale_x_continuous("Longitude") + scale_y_continuous("Latitude") +
coord_map() + labs(title = paste(plotTitle)) + facet_grid(. ~ Combine) +
scale_colour_gradientn(colours = rainbow(7))
makeFootnote(footnote)
plotTitle <- "Engine Speed Histograms (Combine 1 = Red, Combine 2 = Blue)"
ggplot(Combine12, aes(x = ENG_SPD)) + geom_histogram(data = subset(Combine12,
Combine == 1), fill = "red", alpha = 0.2) + geom_histogram(data = subset(Combine12,
Combine == 2), fill = "blue", alpha = 0.2) + labs(title = paste(plotTitle),
x = "Engine Speed (rpm)") + theme_bw() + xlim(2100, 2500)
## stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to adjust this.
## stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to adjust this.
## Warning: position_stack requires constant width: output may be incorrect
## Warning: position_stack requires constant width: output may be incorrect
makeFootnote(footnote)
plotTitle <- "Harvest Pattern for Combines 1 and 2 with TORQUE (% of torque at rated)"
qplot(LONGITUDE, LATITUDE, data = Combine12, geom = "point", color = TORQUE,
size = I(2)) + scale_x_continuous("Longitude") + scale_y_continuous("Latitude") +
coord_map() + labs(title = paste(plotTitle)) + facet_grid(. ~ Combine) +
scale_colour_gradientn(colours = rainbow(7))
makeFootnote(footnote)
plotTitle <- "TORQUE Histograms (Combine 1 = Red, Combine 2 = Blue)"
ggplot(Combine12, aes(x = TORQUE)) + geom_histogram(data = subset(Combine12,
Combine == 1), fill = "red", alpha = 0.2) + geom_histogram(data = subset(Combine12,
Combine == 2), fill = "blue", alpha = 0.2) + labs(title = paste(plotTitle),
x = "TORQUE (% of torque at rated)") + theme_bw()
## stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to adjust this.
## stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to adjust this.
makeFootnote(footnote)
plotTitle <- "Harvest Pattern for Combines 1 and 2 with Ground Speed (mph)"
qplot(LONGITUDE, LATITUDE, data = Combine12, geom = "point", color = MPH, size = I(2)) +
scale_x_continuous("Longitude") + scale_y_continuous("Latitude") + coord_map() +
labs(title = paste(plotTitle)) + facet_grid(. ~ Combine) + scale_colour_gradientn(colours = rainbow(7))
makeFootnote(footnote)
plotTitle <- "Ground Speed Histograms (Combine 1 = Red, Combine 2 = Blue)"
ggplot(Combine12, aes(x = MPH)) + geom_histogram(data = subset(Combine12, Combine ==
1), fill = "red", alpha = 0.2) + geom_histogram(data = subset(Combine12,
Combine == 2), fill = "blue", alpha = 0.2) + labs(title = paste(plotTitle),
x = "Ground Speed (mph)") + theme_bw()
## stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to adjust this.
## stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to adjust this.
makeFootnote(footnote)
plotTitle <- "Harvest Pattern for Combines 1 and 2 with Fuel Rate (kg/hour)"
qplot(LONGITUDE, LATITUDE, data = Combine12, geom = "point", color = FUEL_KG.HR,
size = I(2)) + scale_x_continuous("Longitude") + scale_y_continuous("Latitude") +
coord_map() + labs(title = paste(plotTitle)) + facet_grid(. ~ Combine) +
scale_colour_gradientn(colours = rainbow(7))
makeFootnote(footnote)
plotTitle <- "Fuel Rate Histograms (Combine 1 = Red, Combine 2 = Blue)"
ggplot(Combine12, aes(x = FUEL_KG.HR)) + geom_histogram(data = subset(Combine12,
Combine == 1), fill = "red", alpha = 0.2) + geom_histogram(data = subset(Combine12,
Combine == 2), fill = "blue", alpha = 0.2) + labs(title = paste(plotTitle),
x = "Fuel Rate (kg/hour)") + theme_bw()
## stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to adjust this.
## stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to adjust this.
makeFootnote(footnote)
Now try some data mining with Recursive Partitioning (rpart):
FuelRate.rp <- rpart(FUEL_KG.HR ~ STATUS + DRY_YIELD + TORQUE + ENG_SPD + CURVE +
MPH + Combine, data = Combine12, cp = 0.001)
plotcp(FuelRate.rp)
makeFootnote(footnote)
The last node just crosses the threhold, so no need to prune the tree.
FuelRate.rp
## n= 23532
##
## node), split, n, deviance, yval
## * denotes terminal node
##
## 1) root 23532 593500.00 29.320
## 2) TORQUE< 65.5 11059 163200.00 25.490
## 4) TORQUE< 55.5 2914 36140.00 21.000
## 8) TORQUE< 46.5 1028 11760.00 18.080
## 16) ENG_SPD< 1985 20 101.50 2.731 *
## 17) ENG_SPD>=1985 1008 6857.00 18.380
## 34) ENG_SPD< 2372 879 3652.00 17.850
## 68) TORQUE< 40.5 339 1096.00 16.320 *
## 69) TORQUE>=40.5 540 1265.00 18.810 *
## 35) ENG_SPD>=2372 129 1267.00 22.000 *
## 9) TORQUE>=46.5 1886 10780.00 22.600
## 18) ENG_SPD< 2316 210 1251.00 19.060 *
## 19) ENG_SPD>=2316 1676 6567.00 23.040
## 38) TORQUE< 51.5 688 2570.00 21.970
## 76) ENG_SPD< 2354 489 673.60 21.210 *
## 77) ENG_SPD>=2354 199 902.40 23.860 *
## 39) TORQUE>=51.5 988 2662.00 23.790
## 78) CURVE=1 927 1672.00 23.540 *
## 79) CURVE=2 61 63.55 27.560 *
## 5) TORQUE>=55.5 8145 47430.00 27.090
## 10) CURVE=1 7014 20650.00 26.400
## 20) TORQUE< 60.5 2825 6470.00 25.210
## 40) ENG_SPD< 2288 30 202.70 18.890 *
## 41) ENG_SPD>=2288 2795 5054.00 25.280
## 82) ENG_SPD< 2364 2733 3537.00 25.190
## 164) TORQUE< 57.5 935 955.00 24.410 *
## 165) TORQUE>=57.5 1798 1717.00 25.600 *
## 83) ENG_SPD>=2364 62 640.50 29.000 *
## 21) TORQUE>=60.5 4189 7474.00 27.200
## 42) ENG_SPD< 2272 28 277.70 19.850 *
## 43) ENG_SPD>=2272 4161 5674.00 27.250
## 86) TORQUE< 63.5 2478 2737.00 26.850
## 172) ENG_SPD< 2324 1291 901.10 26.350 *
## 173) ENG_SPD>=2324 1187 1150.00 27.400 *
## 87) TORQUE>=63.5 1683 1951.00 27.840 *
## 11) CURVE=2 1131 2450.00 31.400
## 22) TORQUE< 61.5 405 549.20 30.100 *
## 23) TORQUE>=61.5 726 832.20 32.130 *
## 3) TORQUE>=65.5 12473 123600.00 32.720
## 6) CURVE=1 10326 71650.00 31.930
## 12) TORQUE< 73.5 4500 13980.00 29.870
## 24) TORQUE< 69.5 2511 5360.00 29.050
## 48) ENG_SPD< 2306 916 1804.00 28.180
## 96) ENG_SPD< 2253 12 112.60 20.320 *
## 97) ENG_SPD>=2253 904 941.10 28.280 *
## 49) ENG_SPD>=2306 1595 2448.00 29.560 *
## 25) TORQUE>=69.5 1989 4807.00 30.910
## 50) ENG_SPD< 2294 698 1686.00 29.880 *
## 51) ENG_SPD>=2294 1291 1980.00 31.470 *
## 13) TORQUE>=73.5 5826 23880.00 33.520
## 26) TORQUE< 79.5 3890 10170.00 32.880
## 52) ENG_SPD< 2288 1859 4130.00 32.060
## 104) ENG_SPD< 2230 14 123.80 23.970 *
## 105) ENG_SPD>=2230 1845 3083.00 32.120
## 210) TORQUE< 76.5 767 1122.00 31.380 *
## 211) TORQUE>=76.5 1078 1243.00 32.650 *
## 53) ENG_SPD>=2288 2031 3657.00 33.620
## 106) TORQUE< 76.5 1239 1717.00 33.100 *
## 107) TORQUE>=76.5 792 1061.00 34.450 *
## 27) TORQUE>=79.5 1936 8829.00 34.820
## 54) ENG_SPD< 2282 1312 5552.00 34.230
## 108) TORQUE< 83.5 865 1963.00 33.650 *
## 109) TORQUE>=83.5 447 2729.00 35.360 *
## 55) ENG_SPD>=2282 624 1878.00 36.050 *
## 7) CURVE=2 2147 14440.00 36.530
## 14) TORQUE< 72.5 1028 2522.00 34.520
## 28) TORQUE< 69.5 671 963.40 33.840 *
## 29) TORQUE>=69.5 357 665.90 35.800 *
## 15) TORQUE>=72.5 1119 3975.00 38.370
## 30) TORQUE< 75.5 392 803.50 37.190 *
## 31) TORQUE>=75.5 727 2336.00 39.000 *
Note: the only variables that appear are TORQUE, ENG_SPD and CURVE (unloading auger: 1 = off, 2 = engaged).
plotTitle <- "rpart Results for Fuel Rate (kg/hour)"
plot(FuelRate.rp, main = plotTitle)
text(FuelRate.rp, digits = 4, cex = 0.65)
makeFootnote(footnote)
That's hard to read, so try “prp” (pretty rpart):
plotTitle <- "rpart Results for Fuel Rate (kg/hour) using prp"
prp(FuelRate.rp, main = plotTitle)
makeFootnote(footnote)
That's better, but still difficult to explain, so try enhancing prp:
plotTitle <- "rpart Results for Fuel Rate (kg/hour) using prp Enhanced"
prp(FuelRate.rp, type = 2, nn = TRUE, fallen.leaves = TRUE, faclen = 0, varlen = 0,
shadow.col = "grey", branch.lty = 3, main = plotTitle)
makeFootnote(footnote)
Now plot all 4 variables on one plot: ENG_SPD, TORQUE, CURVE and Fuel Rate. Played around with ggplot2's “alpha” and “size” to create a pseudo contour plot.
plotTitle <- "ENG_SPD versus TORQUE by CURVE (unloading auger: 1 = off, 2 = engaged)"
qplot(ENG_SPD, TORQUE, data = Combine12, color = FUEL_KG.HR, alpha = I(1/4),
size = I(4)) + facet_grid(. ~ CURVE) + labs(title = paste(plotTitle)) +
scale_colour_gradientn(colours = rainbow(7)) + theme_bw()
makeFootnote(footnote)
Even though Combine didn't show up in the data mining results, put in in anyway to see what it looks like.
plotTitle <- "ENG_SPD versus TORQUE by CURVE (columns) and Combine (rows)"
qplot(ENG_SPD, TORQUE, data = Combine12, color = FUEL_KG.HR, alpha = I(1/4),
size = I(4)) + facet_grid(Combine ~ CURVE) + labs(title = paste(plotTitle)) +
scale_colour_gradientn(colours = rainbow(7)) + theme_bw()
makeFootnote(footnote)
It seems the two combines performed differently because they had different levels of Engine Speed and Torque - but why?
Calculate productivity metrics (e.g. tons/hour): Need to “normalize” the data
Add Elevation/Altitude and other environmental variables
Predictive Modeling - interactions? nonlinear?
Animations: R, Tableau, other.
Why are Combines 1 and 2 different? Machine or Operator?
Or maybe there were data collection errors?